By the end of this section, students will be able to:
R skills
optimize()optim()solnp()Unconstrained optimization seeks to find the maximum or minimum of a function without any restrictions on the variables. This forms the foundation for understanding more complex constrained optimization problems.
For a function \(f: \mathbb{R}\to \mathbb{R}\), we want find the minimum or the maximum of the function \(f(x).\) Since a maximisation problem can always ne formulated a minimisation problem of its negation \(\max_{x\in\mathbb{R}}f(x) \Rightarrow \min_{x\in\mathbb{R}}-f(x)\), we will focus on the minimisation problems.
\[\min_{x \in \mathbb{R}} f(x)\]
Definition 1.1 (First-Order Condition): For a function \(f(x)\) to have a local extremum at \(x = a\), it is necessary that: \[f'(a) = 0\]
Definition 1.2 (Second-Order Condition): For a critical point \(x = a\) where \(f'(a) = 0\):
Remark:
TechStart Inc. wants to maximize profit from their new software product. Market research shows demand follows \(D(p) = 1000 - 10p\) and production costs are \(C(q) = 5000 + 20q\). The profit function is:
\[\pi(p) = p \cdot D(p) - C(D(p)) = p(1000 - 10p) - (5000 + 20(1000 - 10p))\] \[\pi(p) = 1000p - 10p^2 - 25000 + 200p = 1200p - 10p^2 - 25000\]
To find the optimal price, we apply the first and second-order conditions.
For TechStart’s profit function \(\pi(p) = 1200p - 10p^2 - 25000\):
Step 1: First-Order Condition \[\pi'(p) = 1200 - 20p = 0\] \[p^* = 60\]
Step 2: Second-Order Condition \[\pi''(p) = -20 < 0\]
Since \(\pi''(60) < 0\), \(p = 60\) gives a maximum profit.
Step 3: Calculate Maximum Profit \[\pi(60) = 1200(60) - 10(60)^2 - 25000 = 72000 - 36000 - 25000 = 11000\]
profit <- function(p) {
1200*p - 10*p^2 - 25000
}
optimize(profit, interval = c(0, 200), maximum = TRUE)## $maximum
## [1] 60
##
## $objective
## [1] 11000
A firm sells a premium product. Market demand decreases nonlinearly with price: \[q(p) = 1500\, e^{-0.04p},\] where \(p \ge 0\) is the price (in dollars) and \(q(p)\) is quantity demanded.
The firm’s total cost consists of a fixed overhead and a per-unit cost proportional to output: \[C(p) = 3000 \;+\; 18\, q(p).\] Revenue at price \(p\) is \(R(p) = p \cdot q(p).\)
Find the profit maximising quantity \(q\).
Complete the task using optimize() in following R
Environment or in your local RStudio.
For a multivariable function \(f:\mathbb{R}^n \to \mathbb{R}\) we want find the minimum.
\[\min_{x \in \mathbb{R}^n} f(x)\]
Definition 1.3 (First-Order Condition): For a function \(f:\mathbb{R}^n \to \mathbb{R}\) to have a local extremum at \(x = x_0\), it is necessary that: \[\nabla f(x_0) = 0\]
Definition 1.4 (Second-Order Condition): For a critical point \(x = x_0\) where \(\nabla f(x_0) = 0\): \[H = \nabla^2 f(x_0)\]
For \(n=2\), i.e. \(f\) is a function of two variables \(f(x,y)\) we have
\[\nabla f(x) = \begin{pmatrix} f_{x} \\ f_{x} \end{pmatrix}\]
Second-Order Conditions: Define the Hessian matrix: \[H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}\]
Sarah owns a coffee shop that sells two main products: coffee and pastries. Her daily revenue depends on both the number of cups of coffee sold \(x\) and the number of pastries sold \(y\). Through market research, she has determined that her revenue function is
\[R(x, y) = 120x + 100y - 2x² - y² - xy\]
How can Sarah maximise her revenue?
Step 1: Find Critical Points by FOC
\(\nabla f =\begin{pmatrix} \frac{\partial R}{\partial x} \\ \frac{\partial R}{\partial y}\end{pmatrix}=\begin{pmatrix} 120-4x-y \\100 -2y -x \end{pmatrix}=\begin{pmatrix} 0 \\ 0\end{pmatrix}\)
Rearrange in matrix form \[\begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} x \\y\end{pmatrix}=\begin{pmatrix} 120 \\100\end{pmatrix} \]
Solve the two equation system
## [,1]
## [1,] 20
## [2,] 40
Step 2: Compute eigenvalues of the Hessian
\[H = \begin{pmatrix} -4 & -1 \\ -1 & -2 \end{pmatrix}\]
## [1] -1.585786 -4.414214
Step 3: Apply Second-Order Test
Since eigen(H)$values < 0, hence the Hessian matrix \(H\) is negative definite, the critical point (20, 40) is a local maximum.
Step 4: Calculate Maximum Profit \[R(20, 40) = 120(20) + 100(40) - 2(20)² - (40)² - (20)(40) = 3200\]
# Define the negative profit function (because optim minimizes by default)
profit <- function(v) {
x <- v[1]
y <- v[2]
- (120*x + 100*y - 2*x^2 - y^2 - x*y) # Negative for maximization
}
# Use optim to maximize profit
optim(par = c(0, 0), fn = profit)## $par
## [1] 20 40
##
## $value
## [1] -3200
##
## $counts
## function gradient
## 163 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
library(plotly)
A<- matrix(c(-4,-1,-1,-2),2,2,byrow=TRUE)
# Create a grid over [-10, 10]
x <- seq(-0.3, 0.3,0.01 )
y <- seq(-0.3, 0.3,0.01)
z <- outer(x, y, function(x, y) {
mapply(function(x, y) {
X <- c(x, y)
t(X) %*% A %*% X
}, x, y)
})
plot_ly(x = x, y = y, z = z, type = "surface") |>
layout(title = "Interactive Surface: z = xáµ€Ax",
scene = list(
xaxis = list(title = "dx"),
yaxis = list(title = "dy"),
zaxis = list(title = "dz")))A company produces three products: \(x\), \(y\), and \(z\).
The total revenue (in thousands of dollars) is modeled by: \[
R(x,y,z) = 50x^{0.5}y^{0.3}z^{0.2} - 0.5x^2 - 0.4y^2 - 0.6z^2 + 2xy - xz
+ yz
\] where: - \(x, y, z \ge 0\)
are the quantities of the three products (in hundreds of units).
Task:
Find the production levels \((x, y,
z)\) that maximize revenue.
Instructions: 1. Write the gradient vector \(\nabla R(x,y,z)\) by computing the partial derivatives with respect to \(x\), \(y\), and \(z\). 2. Set \(\nabla R(x,y,z) = 0\) and solve for the critical point(s). 3. Form the Hessian matrix and evaluate it at each critical point. 4. Use the second-order test for multivariable functions to determine whether each point is a local maximum, local minimum, or saddle point. 5. Interpret the optimal production quantities and the maximum revenue.
Solve the problem numerically by optim() to confirm your
analytic solution. You can complete the task in following R Environment
or in your local RStudio.
GreenInvest, an ESG-focused investment firm, faces a portfolio optimization problem involving the allocation of $100 million across three green tech companies: SolarTech, EcoTransport, and CleanWater. Each offers different returns and risks, requiring the firm to maximize expected returns while strictly adhering to the constraint that total investment equals exactly $100 million.
This constraint transforms the problem from unconstrained optimization to a constrained optimization task. The Lagrange multiplier method is employed to solve it, balancing the objective of return maximization with the investment constraint.
For a multivariable function \(f:\mathbb{R}^n \to \mathbb{R}\) we want find the minimum under \(m\) constraints \(h_i(x)=0\) for \(i = 1,2,...,m\) where each constraint is a multivariable function \(h_i:\mathbb{R}^n \to \mathbb{R}\).
\[\min_{x \in \mathbb{R}^n} f(x)\] \[\text{s.t.} \hspace{0.5cm} h_i(x) = 0, \hspace{0.5cm} i = 1, \dots, m\]
Definition 2.1 (Lagrange Function)
\[\mathcal{L}(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i h_i(x)\] The multivariable function \(\mathcal{L}(x, \lambda)\) is called Lagrange function.
Definition 2.2 (Lagrange First-Order Condition): For the function \(f:\mathbb{R}^n \to \mathbb{R}\) to have a local extremum at \(x = x_0\) under the constraints \(h_i(x_0)=0\), it is necessary that:
\[ \nabla_x \mathcal{L}(x, \lambda) = \nabla f(x) + \sum_{i=1}^m \lambda_i \nabla h_i(x) = 0, \quad h_i(x) = 0 \quad \text{for all } i\]
Definition 2.3 (Lagrange Second-Order Condition): A critical point \(x = x_0\) is a local minimum if the following condition holds
For all nonzero vectors \(d \in \mathbb{R}^n\) such that
\[\nabla h_i(x^*)^\top d = 0 \text{ for all } i = 1, \dots, m, \\[0.5em]\]
\[d^\top \nabla^2_{xx} \mathcal{L}(x^*, \lambda) d > 0.\]
At the constrained optimum, the surface of the objective along the constraint should at its local optimum. This implies the constraint line must be the tangent of the level line at that optimum point, otherwise along the constraint line the objective function would increase or decrease.
The fact that the constraint line is the tangent of the level line at the optimum in mathematical term is that the gradient of the objective surface \(\nabla f\) is parallel to the gradient of the constraint \(\nabla g\). This leads to
\[\nabla f = \lambda \nabla g\] Moving all terms to the left hand side we have \[\nabla f - \lambda \nabla g = 0\], and defining it the gradient of a function \(f-\lambda g\), this lead to the Lagrange method: we can find the critical point with the first order condition:
\[ \nabla \mathcal{L}(x) = \nabla f(x) - \lambda \nabla g(x) = 0\] In many cases the constraints are nonlinear, however at the critical point the local behaviour of a nonlinear curve is like the behaviour of the tangent of the constraint at the point. Therefore, the argument above is still valid. In linear cases \(\nabla g(x)\) is a constant vector and in nonlinear cases \(\nabla g(x)\) varies with \(x\).
A firm wants to allocate its $100000 advertising budget between two channels:
The estimated revenue generated from these investments is: \[ R(x, y) = 200\sqrt{x} + 300\sqrt{y} \] The firm must satisfy the total budget constraint: \[ x + y = 100{,}000 \]
We formulate this business problem as an maximisation problem subject to the equality constraint: \[ \max_{x, y} \quad R(x, y) = 200\sqrt{x} + 300\sqrt{y} \] \[ \text{subject to } x + y = 100000 \]
Lagrangian Method
We form the Lagrangian: \[ \mathcal{L}(x, y, \lambda) = 200\sqrt{x} + 300\sqrt{y} - \lambda(x + y - 100{,}000) \]
First-Order Conditions (FOCs)
\[\begin{align*} \frac{\partial \mathcal{L}}{\partial x} &= \frac{100}{\sqrt{x}} - \lambda = 0 \quad (1) \\ \frac{\partial \mathcal{L}}{\partial y} &= \frac{150}{\sqrt{y}} - \lambda = 0 \quad (2) \\ \frac{\partial \mathcal{L}}{\partial \lambda} &= x + y - 100{,}000 = 0 \quad (3) \end{align*}\]
From equations (1) and (2): \[ \frac{100}{\sqrt{x}} = \frac{150}{\sqrt{y}} \Rightarrow \frac{\sqrt{y}}{\sqrt{x}} = \frac{3}{2} \Rightarrow \frac{y}{x} = \left(\frac{3}{2}\right)^2 = \frac{9}{4} \Rightarrow y = \frac{9}{4}x \]
Substitute into the budget constraint: \[ x + \frac{9}{4}x = 100{,}000 \Rightarrow \frac{13}{4}x = 100{,}000 \Rightarrow x = \frac{400{,}000}{13} \approx 30{,}769.23 \] \[ y = \frac{9}{4}x = \frac{900{,}000}{13} \approx 69{,}230.77 \]
Second-Order Condition (SOC)
We verify the SOC using the bordered Hessian approach. Note that the constraint is linear and the second order derivative of linear function is zero. We have \[\nabla^2 \mathcal{L}(x,\lambda)_{x,y} = \nabla^2 R(x,\lambda)_{x,y}+\nabla^2 \lambda(x+y-100000)=\nabla^2 R(x,\lambda)_{x,y}\] We need only to calculate Hessian of \(R(x,y\) for SOC.
The second derivatives are: \[ \frac{\partial^2 R}{\partial x^2} = -100x^{-3/2} < 0, \quad \frac{\partial^2 R}{\partial y^2} = -150y^{-3/2} < 0 \]
Hessian matrix: \[ H = \begin{bmatrix} -100x^{-3/2} & 0 \\ 0 & -150y^{-3/2} \end{bmatrix} \]
Constraint gradient: \[ \nabla g = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
Tangent space vectors satisfy: \[ d = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
Evaluate the SOC: \[ d^\top \mathcal{L}_{x,y}d = d^\top R_{x,y}d = d^\top H d = -100x^{-3/2} - 150y^{-3/2} < 0 \]
Therefore, the Hessian is negative definite on the tangent space \(\Rightarrow\) local maximum.
Determine the maximum revenue
\[R(x^*,y^*)=200\sqrt{30769.23}+300\sqrt{69230.77}=114017.5\]
library(Rsolnp)
# Objective function to maximize — note: Rsolnp minimizes by default, so we negate it
fn <- function(par) {
x <- par[1]
y <- par[2]
return(-(200 * sqrt(x) + 300 * sqrt(y))) # negative for maximization
}
# Equality constraint function: x + y = 100000
eqfun <- function(par) {
return(sum(par))
}
# Equality bound
eqB <- 100000
# Initial guess
start <- c(50000, 50000)
# Solve with Rsolnp
res <- solnp(pars = start,
fun = fn,
eqfun = eqfun,
eqB = eqB,
LB = c(0, 0)) # lower bounds for x and y##
## Iter: 1 fn: -114017.5423 Pars: 30763.47320 69236.52680
## Iter: 2 fn: -114017.5423 Pars: 30763.47320 69236.52680
## solnp--> Completed in 2 iterations
r_sqrt_surface_with_constraint_plane_contour_plot
Definition 3.1 (KKT Conditions): For the problem: \[\min f(x) \quad \text{subject to} \quad g_i(x) \leq 0, \quad h_j(x) = 0\]
The Karush-Kuhn-Tucker conditions are:
Economic Interpretation: The multiplier \(\mu_i\) represents the shadow price of constraint \(i\). If \(\mu_i > 0\), the constraint is binding (active). If \(\mu_i = 0\), the constraint is not binding.
\[ \min_{x,y}\; f(x,y)=(x-5)^2+(y-5)^2 \]
Constraints.
\[
x+y=3,
\] \[
x+2y\ge 5.
\]
First-order conditions(KKT) Write the standard constraints \(h(x,y)=x+y-3=0\) and \(g(x,y)=5-(x+2y)\le 0\).
Lagrangian: \[ \mathcal L(x,y,\lambda,\mu)=(x-5)^2+(y-5)^2+\lambda\,h(x,y)+\mu\,g(x,y),\quad \mu\ge 0. \]
Stationarity: \[ \partial_x\mathcal L=2(x-5)+\lambda-\mu=0,\qquad \partial_y\mathcal L=2(y-5)+\lambda-2\mu=0. \] Feasibility: \(x+y=3,\; x+2y\ge 5\).
Dual feas./slackness: \(\mu\ge 0,\; \mu\,g(x,y)=0\).
Because the unconstrained projection onto \(x+y=3\) is \((1.5,1.5)\) and violates \(x+2y\ge 5\), the inequality is active at optimum: \(x+2y=5\). Solving \[ x+y=3,\quad x+2y=5 \;\Rightarrow\; (x^*,y^*)=(1,2). \] Multipliers from stationarity: \[ \lambda-\mu=8,\quad \lambda-2\mu=6 \;\Rightarrow\; (\lambda^*,\mu^*)=(10,2). \]
Second-order condition (SOC)
\(\nabla^2 f=2I\succ 0\) (strictly convex objective) and the constraints are affine, so the KKT point is the unique global minimizer.
# install.packages("Rsolnp") # if needed
library(Rsolnp)
# Objective
fn <- function(w){
x <- w[1]; y <- w[2]
(x - 5)^2 + (y - 5)^2
}
# Equality: x + y = 3
eqf <- function(w){
w[1] + w[2]
}
# Inequality: x + 2y >= 5 -> return x+2y with lower bound 5
ineqf <- function(w){
w[1] + 2*w[2]
}
ini <- c(0, 0)
res <- solnp(
pars = ini,
fun = fn,
eqfun = eqf, eqB = 3,
ineqfun = ineqf, ineqLB = 5, ineqUB = 1e6,
LB = c(-1e6, -1e6), UB = c(1e6, 1e6)
)##
## Iter: 1 fn: 25.0000 Pars: 1.00000 2.00000
## Iter: 2 fn: 25.0000 Pars: 1.00000 2.00000
## solnp--> Completed in 2 iterations
## [1] 1 2
## [1] 50 25 25
## --- KKT multipliers from stationarity ---
x <- res$pars[1]; y <- res$pars[2]
# Stationarity system:
# 2(x-5) + lambda - mu = 0
# 2(y-5) + lambda - 2*mu = 0
A <- rbind(c(1, -1), c(1, -2))
b <- c(-2*(x - 5), -2*(y - 5))
sol <- solve(A, b)
lambda <- sol[1]; mu <- sol[2]
list(x = x, y = y, lambda = lambda, mu = mu,
eq_ok = abs(x + y - 3) < 1e-8,
ineq_val = x + 2*y, ineq_active = abs(x + 2*y - 5) < 1e-8)## $x
## [1] 1
##
## $y
## [1] 2
##
## $lambda
## [1] 10
##
## $mu
## [1] 2
##
## $eq_ok
## [1] TRUE
##
## $ineq_val
## [1] 5
##
## $ineq_active
## [1] FALSE
A manufacturer produces two standard products, \(x\) and \(y\), and one premium product, \(z\).
The profit function (in thousands of dollars) is given by: \[
\pi(x,y,z) = 40x^{0.5}y^{0.3}z^{0.2} - 0.4x^2 - 0.3y^2 - 0.5z^2
\] where \(x, y, z \ge 0\) are
production quantities (in hundreds of units).
The firm faces the following constraints:
Resource constraint (equality)
The total labor used by all products must equal 2400 hours: \[
4x + 6y + 8z = 2400
\] where the coefficients represent labor hours per hundred
units.
Market constraint (inequality)
The premium product’s output cannot exceed 60% of the total production:
\[
z \le 0.6(x + y + z)
\]
Task: 1. Write down the Lagrangian function for the problem, introducing a Lagrange multiplier \(\lambda\) for the equality constraint and \(\mu\) for the inequality constraint. 2. State the Karush–Kuhn–Tucker (KKT) conditions. 3. Solve the first-order conditions along with the constraints to find the optimal \((x^*, y^*, z^*)\), the associated multipliers, and the maximum profit. 4. Interpret the economic meaning of the Lagrange multipliers \(\lambda\) and \(\mu\).
Hint: Remember that for the inequality constraint \(g(x,y,z) \le 0\), the KKT complementary slackness condition requires \(\mu \cdot g(x,y,z) = 0\) and \(\mu \ge 0\).
Solve the problem by solnp() in following R Environment
or in your local RStudio.
Real business optimization problems often combine multiple concepts: unconstrained optimization for initial analysis, equality constraints for budget limitations, and inequality constraints for practical restrictions.
A firm chooses production levels of two products, \(x\) and \(y\). Profit is given by
\[ \Pi(x,y) = 120x + 100y - 2x^2 - y^2 - xy . \]
This is an unconstrained maximisation problem.
\[ \frac{\partial \Pi}{\partial x} = 120 - 4x - y = 0, \quad \frac{\partial \Pi}{\partial y} = 100 - x - 2y = 0 \]
Solving:
\[ x^* = 20, \quad y^* = 40 \]
The Hessian is
\[ H = \begin{bmatrix} -4 & -1 \\ -1 & -2 \end{bmatrix} \]
\(\det(H) = 7 > 0\) and \(-4 < 0\) ⇒ \(H\) is negative definite ⇒ local (and here global) maximum.
# Profit function
Pi <- function(v) {
x <- v[1]; y <- v[2]
120*x + 100*y - 2*x^2 - y^2 - x*y
}
# Negate because optim() minimizes
negPi <- function(v) -Pi(v)
# Gradient of -Pi
grad_negPi <- function(v) {
x <- v[1]; y <- v[2]
gx <- -(120 - 4*x - y)
gy <- -(100 - x - 2*y)
c(gx, gy)
}
start <- c(0, 0)
fit <- optim(
par = start,
fn = negPi,
gr = grad_negPi,
method = "BFGS",
control = list(reltol = 1e-12)
)
fit$par # x*, y*## [1] 20.00000 39.99998
## [1] -3200
Contour Plot
Interactive 3D Surface Plot
We form a portfolio of two ASX stocks: CBA.AX and BHP.AX. Let weights be \(w_1\) (CBA) and \(w_2\) (BHP), with the budget constraint \(w_1 + w_2 = 1\). We maximize mean–variance utility with risk aversion \(\gamma>0\): \[ \max_{w_1,w_2}\; \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 -\frac{\gamma}{2}\Big[\hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2\Big]\]
\[ \text{s.t. } w_1+w_2=1.\]
Here \(\hat{\mu}_i\) are sample mean returns, and \[ \hat{\Sigma}= \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\ \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix} \quad\text{is the sample covariance matrix.} \]
We form a portfolio of two ASX stocks: CBA.AX
(Commonwealth Bank) and BHP.AX (BHP Group).
Let weights be \(w_1\) (CBA) and \(w_2\) (BHP), with the budget
constraint \(w_1 + w_2 =
1\).
We will estimate monthly log-return means and covariances from the last 5 years of data, then solve a mean–variance utility maximisation problem.
Let \(P_{i,t}\) be the adjusted closing price of asset \(i \in \{1,2\}\) at month \(t\) (with \(i=1\) for CBA, \(i=2\) for BHP). Define monthly log returns \[ r_{i t} \;=\; \ln\!\left(\frac{P_{i,t}}{P_{i,t-1}}\right), \qquad t=1,\dots,T. \]
Sample mean (monthly) \[ \hat{\mu} = \begin{bmatrix} \hat{\mu}_1\\[2pt] \hat{\mu}_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{T}\sum_{t=1}^T r_{1t}\\[6pt] \frac{1}{T}\sum_{t=1}^T r_{2t} \end{bmatrix}. \]
Sample covariance (monthly) \[ \hat{\Sigma} = \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\[2pt] \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix}, \quad \hat{\sigma}_1^2=\frac{1}{T-1}\sum_{t=1}^T (r_{1t}-\hat{\mu}_1)^2,\; \hat{\sigma}_2^2=\frac{1}{T-1}\sum_{t=1}^T (r_{2t}-\hat{\mu}_2)^2,\; \hat{\sigma}_{12}=\frac{1}{T-1}\sum_{t=1}^T (r_{1t}-\hat{\mu}_1)(r_{2t}-\hat{\mu}_2). \]
(Optional annualisation: multiply means by \(12\) and standard deviations by \(\sqrt{12}\).)
Given risk aversion \(\gamma>0\), choose \((w_1,w_2)\) to maximise mean–variance utility: \[ \max_{w_1,w_2}\;\; \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 \;-\;\frac{\gamma}{2}\left( \hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2 \right) \quad\text{s.t.}\quad w_1 + w_2 = 1. \]
(No-shorting \(0\le w_i\le1\) can be added later; here we keep only the equality constraint.)
Lagrangian \[ \mathcal{L}(w_1,w_2,\lambda) = \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 -\frac{\gamma}{2}\!\left(\hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2\right) +\lambda(1-w_1-w_2). \]
First-order conditions (FOC) \[ \frac{\partial \mathcal{L}}{\partial w_1}:\quad \hat{\mu}_1 - \gamma(\hat{\sigma}_1^2 w_1 + \hat{\sigma}_{12} w_2) - \lambda = 0, \] \[ \frac{\partial \mathcal{L}}{\partial w_2}:\quad \hat{\mu}_2 - \gamma(\hat{\sigma}_{12} w_1 + \hat{\sigma}_2^2 w_2) - \lambda = 0, \] \[ \frac{\partial \mathcal{L}}{\partial \lambda}:\quad w_1 + w_2 = 1. \]
Second-order condition (SOC)
The Hessian with respect to \((w_1,w_2)\) is \[
H
=
-\gamma
\begin{bmatrix}
\hat{\sigma}_1^2 & \hat{\sigma}_{12}\\[2pt]
\hat{\sigma}_{12} & \hat{\sigma}_2^2
\end{bmatrix}.
\] Since \(\hat{\Sigma}\succ 0\)
and \(\gamma>0\), we have \(H\prec 0\) (negative definite), so the FOC
point is a unique global maximiser.
Subtract the two FOCs to eliminate \(\lambda\):
Substitute \(w_2 = 1 - w_1\) and solve for \(w_1\): \[ \boxed{ w_1^* = \frac{(\hat{\mu}_1 - \hat{\mu}_2)\;+\;\gamma(\hat{\sigma}_2^2 - \hat{\sigma}_{12})} {\gamma\big(\hat{\sigma}_1^2 + \hat{\sigma}_2^2 - 2\hat{\sigma}_{12}\big)} }, \qquad \boxed{\,w_2^* = 1 - w_1^*\, .} \]
\[ \boxed{ \lambda = \frac{ \big(\hat{\sigma}_2^2 - \hat{\sigma}_{12}\big)\hat{\mu}_1 \;+\; \big(\hat{\sigma}_1^2 - \hat{\sigma}_{12}\big)\hat{\mu}_2 \;-\; \gamma\big(\hat{\sigma}_1^2 + \hat{\sigma}_2^2 - 2\hat{\sigma}_{12}\big) }{ \hat{\sigma}_1^2 + \hat{\sigma}_2^2 - 2\hat{\sigma}_{12} } } \]
The denominator equals \(\mathrm{Var}(r_{1}-r_{2})=\hat{\sigma}_1^2+\hat{\sigma}_2^2-2\hat{\sigma}_{12}\), which is positive unless the assets are perfectly linearly dependent.
R code for empirical data extraction
# install.packages(c("tidyquant","dplyr","tidyr","lubridate"), dependencies = TRUE)
library(tidyquant)
library(dplyr)
library(tidyr)
library(lubridate)
symbols <- c("CBA.AX", "BHP.AX")
start <- floor_date(Sys.Date() - years(5), unit = "month")
end <- Sys.Date()
# 1) Download daily prices, then compute monthly log returns from adjusted prices
prices <- tq_get(symbols, from = start, to = end, get = "stock.prices")
monthly_ret <- prices %>%
group_by(symbol) %>%
tq_transmute(
select = adjusted,
mutate_fun = periodReturn,
period = "monthly",
type = "log",
col_rename = "ret"
) %>%
ungroup()
# 2) Wide matrix of returns: rows = months, columns = symbols
R <- monthly_ret %>%
select(symbol, date, ret) %>%
pivot_wider(names_from = symbol, values_from = ret) %>%
arrange(date) %>%
drop_na()
# 3) Empirical moments (monthly log returns)
r_mat <- as.matrix(R[, symbols])
mu <- colMeans(r_mat) # vector c(mu_CBA, mu_BHP)
Sigma <- cov(r_mat) # 2x2 covariance matrix
# Print results
mu## CBA.AX BHP.AX
## 0.018113866 0.009845225
## CBA.AX BHP.AX
## CBA.AX 0.0036015332 0.0001531928
## BHP.AX 0.0001531928 0.0047672853
R code for constrained optimization
library(Rsolnp)
# Three-asset portfolio optimization
mu <- c(0.08, 0.12) # Expected returns
Sigma <- matrix(c(0.04, 0.01,
0.01, 0.09), nrow = 2) # Covariance matrix
Omega <- Sigma
R<-mu
n=2
w = c(1/n,n)
fn1=function(w)
{
-t(R)%*%w + t(w)%*%Omega%*%w ### for gamma = 2
}
### the constraint that the weights sum to 1
eqn1=function(w){
return(sum(w))
}
### the initial value for numerical procedure
ini = c(0.5,0.5)
solution=solnp(ini, fun = fn1, eqfun = eqn1, eqB = c(1),LB=rep(0,n),UB=rep(1,n))##
## Iter: 1 fn: -0.06273 Pars: 0.54545 0.45455
## Iter: 2 fn: -0.06273 Pars: 0.54545 0.45455
## solnp--> Completed in 2 iterations
## [1] 0.5454546 0.4545454
## [,1]
## [1,] -0.06272727
## [1] 1
R provides several powerful tools for solving optimization problems. Understanding when and how to use each tool is crucial for practical problem-solving.
optim() - General-purpose optimizationconstrOptim() - Constrained optimizationlpSolve::lp() - Linear programmingquadprog::solve.QP() - Quadratic programming| Function / Package | Best For | Constraints Supported | Advantages | Limitations |
|---|---|---|---|---|
optim() |
Unconstrained problems or simple box constraints | None, box ("L-BFGS-B") |
Flexible, built-in, multiple methods (BFGS,
CG, Nelder-Mead, etc.) |
No general equality/inequality constraints |
constrOptim() |
Problems with linear inequality constraints | Linear inequality | Direct constraint handling, built into R | No equality constraints, slower for large problems |
lpSolve::lp() |
Linear objective with linear constraints | Linear equality & inequality | Very fast, guaranteed global optimum | Only linear problems |
quadprog::solve.QP() |
Quadratic objective with linear constraints | Linear equality & inequality | Efficient for mean–variance portfolio optimisation | Requires positive-definite quadratic term, linear constraints only |
Rsolnp::solnp() |
Nonlinear objective with equality/inequality constraints | Nonlinear equality & inequality + bounds | Handles general nonlinear constraints; supports bounds | Needs good starting values; may converge to local optima |
nloptr::nloptr() |
Large/complex nonlinear optimisation | Nonlinear equality & inequality + bounds | Many algorithms (global & local), gradient-free or gradient-based | Algorithm choice/tuning is critical |
DEoptim::DEoptim() |
Global optimisation, non-smooth or multi-modal problems | Bounds only | Robust to local minima; no gradient required | Slower for smooth problems; needs tuning population size & iterations |
GenSA::GenSA() |
Global optimisation over complex search spaces | Bounds only | Works for discrete or rugged landscapes; automatic annealing schedule | Can be slow; stochastic results vary |
optimx::optimx() |
Comparing multiple local optimisation methods | Same as optim() methods used |
Simple interface to try several methods and compare results | Constraint support limited to methods used |
Optimization is fundamental to business decision-making: -
Strategic Planning: Resource allocation and capacity
planning - Financial Management: Portfolio optimization
and risk management
- Operations: Production scheduling and supply chain
optimization - Marketing: Pricing strategies and budget
allocation
The progression from unconstrained to constrained optimization mirrors the evolution from theoretical models to practical business applications. Understanding both the mathematical theory and computational implementation enables effective problem-solving in complex business environments.
This concludes Section 5: Optimization. Students should now be equipped to tackle real-world optimization problems using both analytical and computational approaches.